Background

The file ‘WeatherDownloads_202005_v002.Rmd’ contains code for dowloading and processing historical weather data as contained in METAR archives hosted by Iowa State University.

Data have been dowloaded and processed for several stations (airports) and years, with .rds files saved in “./RInputFiles/ProcessedMETAR”.

This module will perform exploratory data analysis on the processed weather files.

Data Availability

Each processed data file contains one year of hourly weather data for one station. Files are saved as ‘./RInputFiles/ProcessedMETAR/metar_kxxx_yyyy.rds’ where xxx is the three-digit airport code and yyyy is the four-digit year.

Each file contains the following variables:

  • METAR (chr) - the extracted portion of the METAR based on a regex string
  • WindDir (chr) - the previaling wind direction in degrees, stored as a character since ‘VRB’ means variable
  • WindSpeed (int) - the prevailing wind speed in knots
  • WindGust (dbl) - the wind gust speed in knots (NA if there is no recorded wind gust at that hour)
  • Dummy (chr) - artifact, always a blank space
  • Visibility (dbl) - surface visibility in statute miles
  • TempC (int) - temperature in degrees Celsius
  • DewC (int) - dew point in degrees Celsius
  • Altimeter (int) - altimeter in inches of mercury
  • SLP (int) - the raw sea-level-pressure reading from the METAR
  • FahrC (chr) - the raw temperature string pulled from the METAR (Tttttdddd) where tttt is the Fahrenheit temperature recorded in Celsius and dddd is the Fahrenheit dew point recorded in Celsius
  • dtime (dttm) - the date-time associated with the observation
  • origMETAR (chr) - the full METAR associated with the observation
  • TempF (dbl) - the Fahrenheit temperature associated with converting FahrC to Fahrenheit
  • DewF (dbl) - the Fahrenheit dew point associated with converting FahrC to Fahrenheit
  • modSLP (dbl) - Sea-Level Pressure (SLP), adjusted to reflect that SLP is recorded as 0-1000 but reflects data that are 950-1050
  • nSKC (int) - number of times ‘SKC’ (human-confirmed cloud-free) is recorded in the observation (should be 0 or 1)
  • nCLR (int) - number of times ‘CLR’ (austomated-sensor cloud-free) is recorded in the observation (should be 0 or 1, and should never have both nSKC>0 and nCLR>0)
  • cloudn (chr) - the nth cloud layer recorded in the METAR (layers begin with FEW, SCT, BKN, OVC or VV)
  • cTypen (chr) - the cloud type of the nth cloud layer (FEW, BKN, SCT, OVC, or VV)
  • cLeveln (dbl) - the cloud height in feet of the nth cloud layer
  • wType (fct) - highest level of obscuration recorded in the METAR (VV > OVC > BKN > SCT > FEW > CLR/SKC)
  • year (dbl) - year of the observation
  • monthint (dbl) - month of the observation as a number (e.g., 6=June)
  • month (fct) - month of the observation as three-character abbreviation, saved as a factor (e.g., Jun=June)
  • day (int) - day of the month of the observation

Base Functions Available

There are several functions available for analysis:

  • plotCountsByMetric() - bar plots for counts by variable

  • plotNumCor() - plot two numeric variables against each other

  • plotFactorNumeric() - boxplot a numeric variable against a factor variable

  • corMETAR() - correlations between METAR variables

  • lmMETAR() - linear regression modeling for METAR variables

  • basicWindPlots() - plot wind speed and direction

  • getWindDirGroup() - convert wind direction to a grouping (e.g., N for 320-360-40)

  • consolidatePlotWind() - show frequency plots of wind direction, city, and month

The tidyverse library is loaded and the 2016 Detroit data is read in to show examples of the functions:

library(tidyverse)
## -- Attaching packages ------------------------------------------------------------------------- tidyverse 1.3.0 --
## v ggplot2 3.2.1     v purrr   0.3.3
## v tibble  2.1.3     v dplyr   0.8.4
## v tidyr   1.0.2     v stringr 1.4.0
## v readr   1.3.1     v forcats 0.4.0
## -- Conflicts ---------------------------------------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
kdtw_2016 <- readRDS("./RInputFiles/ProcessedMETAR/metar_kdtw_2016.rds")

A variable mapping is created to allow for better readable variable names:

varMapper <- c(WindDir="Wind Direction (degrees)", 
               WindSpeed="Wind Speed (kts)", 
               Visibility="Visibility (SM)", 
               TempC="Temperature (C)", 
               DewC="Dew Point (C)", 
               Altimeter="Altimeter (inches Hg)", 
               modSLP="Sea-Level Pressure (hPa)", 
               TempF="Temperature (F)",
               DewF="Dew Point (F)", 
               cType1="First Cloud Layer Type", 
               cLevel1="First Cloud Layer Height (ft)",
               month="Month", 
               wType="Greatest Sky Obscuration", 
               day="Day of Month"
               )

The function plotCountsByMetric() produces bar plots for counts by variable:

# Helper function for generating plots by key variables
plotcountsByMetric <- function(df, 
                               mets, 
                               title="", 
                               rotateOn=20, 
                               dropNA=TRUE, 
                               diagnose=FALSE,
                               mapper=varMapper
                               ) {
    
    # Plot of counts by key metric
    for (x in mets) {
        dat <- df %>%
            group_by_at(vars(all_of(x))) %>%
            summarize(n=n())
        
        if (dropNA) {
            nOrig <- nrow(dat)
            sumOrig <- sum(dat$n)
            dat <- dat %>%
                filter_all(all_vars(!is.na(.)))
            if (diagnose & (nOrig > nrow(dat))) { 
                cat("\nDropping", 
                    nOrig-nrow(dat), 
                    "rows with", 
                    sumOrig-sum(dat$n), 
                    "observations due to NA\n"
                    )
            }
        }
        
        p <- dat %>%
            ggplot(aes_string(x=x, y="n")) + 
            geom_col() + 
            labs(title=title,
                 subtitle=paste0("Counts By: ", mapper[x]), 
                 x=paste0(x, " - ", mapper[x]),
                 y="Count"
                 )
        if (nrow(dat) >= rotateOn) {
            p <- p + theme(axis.text.x=element_text(angle=90))
        }
        print(p)
    }
}

# Example for Detroit 2016 - using WindDir, cType1, month, wType
plotcountsByMetric(kdtw_2016, 
                   mets=c("WindDir", "cType1", "month", "wType"), 
                   title="Detroit, MI (2016)"
                   )

The function plotNumCor() plots two numeric variables against one another:

# Create a function for plotting two variables against each other
plotNumCor <- function(met, 
                       var1, 
                       var2, 
                       title=NULL, 
                       subT="", 
                       dropNA=TRUE, 
                       diagnose=FALSE,
                       mapper=varMapper
                       ) {
    
    # Create the title if not passed
    if (is.null(title)) { 
        title <- paste0("Hourly Observations of ", mapper[var1], " and ", mapper[var2]) 
    }
    
    # Pull the counts by var1-var2
    dat <- met %>%
        group_by_at(vars(all_of(c(var1, var2)))) %>%
        summarize(n=n()) 
    
    # If NA requested to be excluded, remove anything with NA
    if (dropNA) {
        nOrig <- nrow(dat)
        sumOrig <- sum(dat$n)
        dat <- dat %>%
            filter_all(all_vars(!is.na(.)))
        if (diagnose) { 
            cat("\nDropping", nOrig-nrow(dat), "rows", "with", sumOrig-sum(dat$n), "observations due to NA\n")
        }
    }
    
    p <- dat %>%
        ggplot(aes_string(x=var1, y=var2)) + 
        geom_point(alpha=0.5, aes_string(size="n")) + 
        geom_smooth(method="lm", aes_string(weight="n")) + 
        labs(x=paste0(mapper[var1], " - ", var1), 
             y=paste0(mapper[var2], " - ", var2), 
             title=title, 
             subtitle=subT
             )
    
    print(p)
}

# Example for Detroit 2016 - using TempC and TempF
plotNumCor(kdtw_2016, var1="TempC", var2="TempF", subT="Detroit, MI (2016)", diagnose=TRUE)
## 
## Dropping 1 rows with 49 observations due to NA

# Example for Detroit 2016 - using TempC and DewC
plotNumCor(kdtw_2016, var1="TempC", var2="DewC", subT="Detroit, MI (2016)", diagnose=TRUE)
## 
## Dropping 1 rows with 49 observations due to NA

# Example for Detroit 2016 - using Altimeter and modSLP
plotNumCor(kdtw_2016, var1="Altimeter", var2="modSLP", subT="Detroit, MI (2016)", diagnose=TRUE)
## 
## Dropping 1 rows with 49 observations due to NA

The function plotFactorNumeric() creates box plots for a numeric variable against a factor variable:

# Updated function for plotting numeric by factor
plotFactorNumeric <- function(met, 
                              fctVar, 
                              numVar, 
                              title=NULL, 
                              subT, 
                              diagnose=TRUE,
                              showXLabel=TRUE,
                              mapper=varMapper
                              ) {
    
    # Create the title if not passed
    if (is.null(title)) { 
        title <- paste0("Hourly Observations of ", mapper[numVar], " by ", mapper[fctVar])
    }
    
    nOrig <- nrow(met)
    dat <- met %>%
        filter(!is.na(get(fctVar)), !is.na(get(numVar)))
    if (diagnose) { cat("\nRemoving", nOrig-nrow(dat), "records due to NA\n") }
    
    p <- dat %>%
        ggplot(aes_string(x=fctVar, y=numVar)) + 
        geom_boxplot(fill="lightblue") + 
        labs(title=title, 
             subtitle=subT, 
             x=ifelse(showXLabel, paste0(mapper[fctVar], " - ", fctVar), ""), 
             y=paste0(mapper[numVar], " - ", numVar)
             )
    
    print(p)
    
}

# Example for Detroit 2016 - using TempF and month
plotFactorNumeric(kdtw_2016, 
                  fctVar="month", 
                  numVar="TempF", 
                  subT="Detroit, MI (2016)", 
                  showXLabel=FALSE,
                  diagnose=TRUE
                  )
## 
## Removing 49 records due to NA

# Example for Detroit 2016 - using WindSpeed and wType
plotFactorNumeric(kdtw_2016, 
                  fctVar="wType", 
                  numVar="WindSpeed", 
                  subT="Detroit, MI (2016)", 
                  showXLabel=TRUE,
                  diagnose=TRUE
                  )
## 
## Removing 49 records due to NA

# Example for Detroit 2016 - using Visibility and wType
plotFactorNumeric(kdtw_2016, 
                  fctVar="wType", 
                  numVar="Visibility", 
                  subT="Detroit, MI (2016)", 
                  showXLabel=TRUE,
                  diagnose=TRUE
                  )
## 
## Removing 49 records due to NA

An issue previous observed where visibility 1/16SM was interpreted as 16 statutory miles has been corrected in the ‘WeatherDownloads_202005_v002’ file.

The function corMETAR() calculates correlations among numeric variables in the METAR data:

# Function to calculate, display, and plot variable correlations
corMETAR <- function(met, numVars, subT="") {

    # Keep only complete cases and report on data kept
    dfUse <- met %>%
        select_at(vars(all_of(numVars))) %>%
        filter(complete.cases(.))
    
    nU <- nrow(dfUse)
    nM <- nrow(met)
    myPct <- round(100*nU/nM, 1)
    cat("\n *** Correlations use ", nU, " complete cases (", myPct, "% of ", nM, " total) ***\n", sep="")
    
    # Create the correlation matrix
    mtxCorr <- dfUse %>%
        cor()

    # Print the correlations
    mtxCorr %>%
        round(2) %>%
        print()

    # Display a heat map
    corrplot::corrplot(mtxCorr, 
                       method="color", 
                       title=paste0("Hourly Weather Correlations\n", subT), 
                       mar=c(0, 0, 2, 0)
                       )
}

# Example for Detroit, MI 2016
coreNum <- c("TempC", "TempF", "DewC", "DewF", 
             "Altimeter", "modSLP", "WindSpeed", "Visibility"
             )
corMETAR(kdtw_2016, numVars=coreNum, subT="Detroit, MI (2016) METAR")
## 
##  *** Correlations use 8769 complete cases (99.4% of 8818 total) ***
##            TempC TempF  DewC  DewF Altimeter modSLP WindSpeed Visibility
## TempC       1.00  1.00  0.92  0.92     -0.17  -0.24     -0.10       0.14
## TempF       1.00  1.00  0.92  0.92     -0.17  -0.24     -0.10       0.14
## DewC        0.92  0.92  1.00  1.00     -0.22  -0.28     -0.18      -0.07
## DewF        0.92  0.92  1.00  1.00     -0.22  -0.28     -0.18      -0.07
## Altimeter  -0.17 -0.17 -0.22 -0.22      1.00   1.00     -0.37       0.15
## modSLP     -0.24 -0.24 -0.28 -0.28      1.00   1.00     -0.35       0.14
## WindSpeed  -0.10 -0.10 -0.18 -0.18     -0.37  -0.35      1.00       0.10
## Visibility  0.14  0.14 -0.07 -0.07      0.15   0.14      0.10       1.00

The function lmMETAR() runs simple linear regression models on the METAR data:

# Function for linear regressions on METAR data
lmMETAR <- function(met, 
                    y, 
                    x, 
                    yName, 
                    subT=""
                    ) {
    
    # Convert to formula
    myChar <- paste0(y, " ~ ", x)
    cat("\n *** Regression call is:", myChar, "***\n")
    
    # Run regression
    regr <- lm(formula(myChar), data=met)
    
    # Summarize regression
    print(summary(regr))
    
    # Predict the new values
    pred <- predict(regr, newdata=met)
    
    # Plot the predictions
    p <- met %>%
        select_at(vars(all_of(y))) %>%
        mutate(pred=pred) %>%
        filter_all(all_vars(!is.na(.))) %>%
        group_by_at(vars(all_of(c(y, "pred")))) %>%
        summarize(n=n()) %>%
        ggplot(aes_string(x=y, y="pred")) + 
        geom_point(aes(size=n), alpha=0.25) + 
        geom_smooth(aes(weight=n), method="lm") + 
        labs(title=paste0("Predicted vs. Actual ", yName, " - ", x, " as Predictor"), 
             subtitle=subT, 
             x=paste0("Actual ", yName), 
             y=paste0("Predicted ", yName)
             )
    print(p)
    
}

# Examples for Detroit, MI 2016
lmMETAR(kdtw_2016, "modSLP", "Altimeter", yName="Sea Level Pressure", subT="Detroit, MI (2016)")
## 
##  *** Regression call is: modSLP ~ Altimeter ***
## 
## Call:
## lm(formula = formula(myChar), data = met)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.96361 -0.44022 -0.03758  0.40772  1.41448 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -16.44389    0.74723  -22.01   <2e-16 ***
## Altimeter    34.41514    0.02488 1383.17   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4961 on 8767 degrees of freedom
##   (49 observations deleted due to missingness)
## Multiple R-squared:  0.9954, Adjusted R-squared:  0.9954 
## F-statistic: 1.913e+06 on 1 and 8767 DF,  p-value: < 2.2e-16

lmMETAR(kdtw_2016, "modSLP", "Altimeter + TempF", yName="Sea Level Pressure", subT="Detroit, MI (2016)")
## 
##  *** Regression call is: modSLP ~ Altimeter + TempF ***
## 
## Call:
## lm(formula = formula(myChar), data = met)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.57802 -0.12674  0.00481  0.12820  0.65708 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -3.713e+00  2.803e-01  -13.25   <2e-16 ***
## Altimeter    3.403e+01  9.302e-03 3658.80   <2e-16 ***
## TempF       -2.351e-02  9.941e-05 -236.48   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1826 on 8766 degrees of freedom
##   (49 observations deleted due to missingness)
## Multiple R-squared:  0.9994, Adjusted R-squared:  0.9994 
## F-statistic: 7.086e+06 on 2 and 8766 DF,  p-value: < 2.2e-16

lmMETAR(kdtw_2016, "TempC", "DewC", yName="Temperature (C)", subT="Detroit, MI (2016)")
## 
##  *** Regression call is: TempC ~ DewC ***
## 
## Call:
## lm(formula = formula(myChar), data = met)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -6.174 -3.172 -1.165  2.822 17.828 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  6.17084    0.05373   114.9   <2e-16 ***
## DewC         0.99909    0.00470   212.6   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.463 on 8767 degrees of freedom
##   (49 observations deleted due to missingness)
## Multiple R-squared:  0.8375, Adjusted R-squared:  0.8375 
## F-statistic: 4.519e+04 on 1 and 8767 DF,  p-value: < 2.2e-16

The basicWindPlots() function creates plots for wind speed and direction:

# Generate basic wind plots
basicWindPlots <- function(met, 
                           dirVar="WindDir", 
                           spdVar="WindSpeed",
                           desc="", 
                           gran="", 
                           mapper=varMapper
                           ) {

    # Plot for the wind direction
    wDir <- met %>%
        ggplot(aes_string(x=dirVar)) + 
        geom_bar() + 
        labs(title=paste0(desc, " Wind Direction"), subtitle=gran, 
             y="# Hourly Observations", x=mapper[dirVar]
             ) + 
        theme(axis.text.x=element_text(angle=90))
    print(wDir)

    # Plot for the minimum, average, and maximum wind speed by wind direction
    # Wind direction 000 is reserved for 0 KT wind, while VRB is reserved for 3-6 KT variable winds
    wSpeedByDir <- met %>%
        filter(!is.na(get(dirVar))) %>%
        group_by_at(vars(all_of(dirVar))) %>%
        summarize(minWind=min(get(spdVar)), meanWind=mean(get(spdVar)), maxWind=max(get(spdVar))) %>%
        ggplot(aes_string(x=dirVar)) +
        geom_point(aes(y=meanWind), color="red", size=2) +
        geom_errorbar(aes(ymin=minWind, ymax=maxWind)) +
        labs(title=paste0(desc, " Wind Speed (Max, Mean, Min) By Wind Direction"), 
             subtitle=gran,
             y=mapper[spdVar], 
             x=mapper[dirVar]
             ) + 
        theme(axis.text.x=element_text(angle=90))
    print(wSpeedByDir)

    # Plot for the wind speed
    pctZero <- sum(pull(met, spdVar)==0, na.rm=TRUE) / nrow(met)
    wSpeed <- met %>%
        filter_at(vars(all_of(spdVar)), all_vars(!is.na(.))) %>%
        ggplot(aes_string(x=spdVar)) +
        geom_bar(aes(y=..count../sum(..count..))) +
        labs(title=paste0(round(100*pctZero), "% of wind speeds in ", desc, " measure 0 Knots"),
             subtitle=gran,
             y="% Hourly Observations", 
             x=mapper[spdVar]
             )
    print(wSpeed)
    
    # Polar plot for wind speed and wind direction
    wData <- met %>%
        filter_at(vars(all_of(dirVar)), all_vars(!is.na(.) & !(. %in% c("000", "VRB")))) %>%
        filter_at(vars(all_of(spdVar)), all_vars(!is.na(.))) %>%
        mutate_at(vars(all_of(dirVar)), as.numeric) %>%
        group_by_at(vars(all_of(c(dirVar, spdVar)))) %>%
        summarize(n=n())
        
    wPolarDirSpeed <- wData %>%
        ggplot(aes_string(x=spdVar, y=dirVar)) +
        geom_point(alpha=0.1, aes(size=n)) +
        coord_polar(theta="y") +
        labs(title=paste0(desc, " Direction vs. Wind Speed"), 
             subtitle=gran, 
             x=mapper[spdVar], 
             y=mapper[dirVar]
             ) +
        scale_y_continuous(limits=c(0, 360), breaks=c(0, 90, 180, 270, 360)) +
        scale_x_continuous(limits=c(0, 40), breaks=c(0, 5, 10, 15, 20, 25, 30, 35, 40)) +
        geom_point(aes(x=0, y=0), color="red", size=2)
    print(wPolarDirSpeed)

}

# Example for Detroit, MI 2016
basicWindPlots(kdtw_2016, desc="Detroit, MI (2016)", gran="KDTW METAR")

The getWindDirGroup() function maps wind direction to a category such as NNE. Because the METAR data are recorded in units of 10 degrees, either 4 groupings (90 degrees each) or 12 groupings (30 degrees each) are preferred, so that each category has the same underlying number of buckets:

# Extract the wind direction data from a processed METAR file
getWindDirGroup <- function(met, src) {
    
    # Use the fullMETAR data and extract WindDir, WindSpeed, month
    windPlotData <- met %>%
        select(WindDir, WindSpeed, month) %>%
        mutate(windDirGroup=factor(case_when(WindSpeed==0 ~ "No Wind", 
                                             WindDir=="VRB" ~ "Variable", 
                                             WindDir %in% c("350", "360", "010") ~ "N", 
                                             WindDir %in% c("020", "030", "040") ~ "NNE", 
                                             WindDir %in% c("050", "060", "070") ~ "ENE", 
                                             WindDir %in% c("080", "090", "100") ~ "E", 
                                             WindDir %in% c("110", "120", "130") ~ "ESE",
                                             WindDir %in% c("140", "150", "160") ~ "SSE", 
                                             WindDir %in% c("170", "180", "190") ~ "S", 
                                             WindDir %in% c("200", "210", "220") ~ "SSW",
                                             WindDir %in% c("230", "240", "250") ~ "WSW", 
                                             WindDir %in% c("260", "270", "280") ~ "W", 
                                             WindDir %in% c("290", "300", "310") ~ "WNW", 
                                             WindDir %in% c("320", "330", "340") ~ "NNW", 
                                             TRUE ~ "Error"
                                             ), 
                                   levels=c("No Wind", "Variable", "Error", 
                                            "N", "NNE", "ENE", 
                                            "E", "ESE", "SSE", 
                                            "S", "SSW", "WSW", 
                                            "W", "WNW", "NNW"
                                            )
                                   )
               )
    
    # Rempve the errors and calculate percentages by month for the remainder
    processedWindData <- windPlotData %>%
        filter(windDirGroup != "Error") %>%
        group_by(month, windDirGroup) %>%
        summarize(n=n()) %>%
        ungroup() %>%
        group_by(month) %>%
        mutate(pct=n/sum(n)) %>%
        ungroup() %>%
        mutate(src=src)
    
    processedWindData

}

The function conslidatePlotWind() then calls getWindDirGroup() for any number of files:

# Consolidate and plot wind data
consolidatePlotWind <- function(files, names) {

    consFun <- function(x, y) { getWindDirGroup(met=x, src=y) }
    boundByRows <- map2_dfr(.x=files, .y=names, .f=consFun)

    # Show frequency by month for each city, faceted by wind direction
    p1 <- boundByRows %>%
        ggplot(aes(x=month, y=pct, color=src)) + 
        geom_line(aes(group=src)) + 
        facet_wrap(~windDirGroup) + 
        labs(title="Wind Frequency by Month", 
             x="Month", 
             y="Frequency of Wind Observations"
             ) +
        theme(axis.text.x=element_text(angle=90))
    print(p1)
    
    # Show frequency by wind direction for each city, faceted by month
    p2 <- boundByRows %>%
        ggplot(aes(x=windDirGroup, y=pct, color=src)) + 
        geom_line(aes(group=src)) + 
        facet_wrap(~month) + 
        labs(title="Wind Frequency by Wind Direction", 
             x="Wind Direction", 
             y="Frequency of Wind Observations"
             ) +
        theme(axis.text.x=element_text(angle=90))
    print(p2)
    
    boundByRows
    
}

# Load the Las Vegas data and New Orleans data for comparison
kmsy_2016 <- readRDS("./RInputFiles/ProcessedMETAR/metar_kmsy_2016.rds")
klas_2016 <- readRDS("./RInputFiles/ProcessedMETAR/metar_klas_2016.rds")

# Run wind by month comparisons for Detroit, Las Vegas, New Orleans
consolidatePlotWind(files=list(kdtw_2016, klas_2016, kmsy_2016), 
                    names=c("Detroit, MI (2016)", "Las Vegas, NV (2016)", "New Orleans, LA (2016)")
                    )

## # A tibble: 504 x 5
##    month windDirGroup     n     pct src               
##    <fct> <fct>        <int>   <dbl> <chr>             
##  1 Jan   No Wind         46 0.0600  Detroit, MI (2016)
##  2 Jan   Variable         3 0.00391 Detroit, MI (2016)
##  3 Jan   N               26 0.0339  Detroit, MI (2016)
##  4 Jan   NNE             26 0.0339  Detroit, MI (2016)
##  5 Jan   ENE              9 0.0117  Detroit, MI (2016)
##  6 Jan   E               16 0.0209  Detroit, MI (2016)
##  7 Jan   ESE             14 0.0183  Detroit, MI (2016)
##  8 Jan   SSE             72 0.0939  Detroit, MI (2016)
##  9 Jan   S              131 0.171   Detroit, MI (2016)
## 10 Jan   SSW            158 0.206   Detroit, MI (2016)
## # ... with 494 more rows

Combining Functions

The functions can be combined so that a full process can be run for a given file:

# File name to city name mapper
cityNameMapper <- c(kdtw_2016="Detroit, MI (2016)", 
                    kmsy_2016="New Orleans, LA (2016)", 
                    klas_2016="Las Vegas, NV (2016)"
                    )


# The following function runs the functions that work on a single data source
combinedEDA <- function(filename, 
                        desc=NULL,
                        mets=c("WindDir", "WindSpeed", "TempC", "DewC", "Altimeter", 
                               "modSLP", "cType1", "cLevel1", "month", "day"
                               ),
                        corPairs=list(c("TempC", "TempF"), 
                                      c("TempC", "DewC"), 
                                      c("Altimeter", "modSLP"), 
                                      c("Altimeter", "WindSpeed")
                                      ),
                        fctPairs=list(c("month", "TempF"), 
                                      c("month", "DewF"), 
                                      c("month", "WindSpeed"), 
                                      c("month", "Altimeter"), 
                                      c("wType", "Visibility"), 
                                      c("wType", "WindSpeed"), 
                                      c("WindDir", "WindSpeed"), 
                                      c("WindDir", "TempF")
                                      ),
                        heatVars=c("TempC", "TempF", 
                                   "DewC", "DewF", 
                                   "Altimeter", "modSLP", 
                                   "WindSpeed", "Visibility", 
                                   "monthint", "day"
                                   ),
                        lmVars=list(c("modSLP", "Altimeter"), 
                                    c("modSLP", "Altimeter + TempF"), 
                                    c("TempF", "DewF"), 
                                    c("WindSpeed", "Altimeter + TempF + DewF + month")
                                    ),
                        mapper=varMapper,
                        path="./RInputFiles/ProcessedMETAR/"
                        ) {
    
    # Find the description if it is NULL (default)
    if (is.null(desc)) {
        for (potMatch in names(cityNameMapper)) {
            if (str_detect(string=filename, pattern=potMatch)) {
                desc <- cityNameMapper[potMatch]
                break
            }
        }
        if (is.null(desc)) {
            desc <- paste0("UNMAPPED_", filename)
            cat("\nUnable to find a description, will use ", desc, "\n", sep="")
        } else {
            cat("\nWill use ", desc, " as the description for ", filename, "\n", sep="")
        }
    }
    
    # Read in the file
    tbl <- readRDS(paste0(path, filename))
    
    # Plot counts by metric
    plotcountsByMetric(tbl, mets=mets, title=desc, diagnose=TRUE)
    
    # Plot relationships between two variables
    for (ys in corPairs) {
        plotNumCor(tbl, var1=ys[1], var2=ys[2], subT=desc, diagnose=TRUE)
    }
    
    # plot numeric vs. factor
    for (ys in fctPairs) {
        plotFactorNumeric(tbl, fctVar=ys[1], numVar=ys[2], subT=desc, showXLabel=FALSE, diagnose=TRUE)
    }
    
    # Heatmap for variable correlations
    corMETAR(tbl, numVars=heatVars, subT=paste0(desc, " METAR"))

    # Run linear rergression
    for (ys in lmVars) {
        lmMETAR(tbl, y=ys[1], x=ys[2], yName=mapper[ys[1]], subT=desc)
    }
    
    # Run the basic wind plots
    basicWindPlots(tbl, desc=desc, gran="")
    
    # Return the tibble
    tbl
    
}

The process can then be run for each of Detroit (2016), Las Vegas (2016), and New Orleans (2016):

# Retrieve the Detroit, MI (2016) data
kdtw_2016 <- combinedEDA("metar_kdtw_2016.rds")
## 
## Will use Detroit, MI (2016) as the description for metar_kdtw_2016.rds
## 
## Dropping 1 rows with 49 observations due to NA

## 
## Dropping 1 rows with 49 observations due to NA

## 
## Dropping 1 rows with 49 observations due to NA

## 
## Dropping 1 rows with 49 observations due to NA

## 
## Dropping 1 rows with 49 observations due to NA

## 
## Dropping 1 rows with 49 observations due to NA

## 
## Dropping 1 rows with 651 observations due to NA

## 
## Dropping 1 rows with 49 observations due to NA

## 
## Dropping 1 rows with 49 observations due to NA

## 
## Dropping 1 rows with 49 observations due to NA

## 
## Dropping 1 rows with 49 observations due to NA

## 
## Removing 49 records due to NA

## 
## Removing 49 records due to NA

## 
## Removing 49 records due to NA

## 
## Removing 49 records due to NA

## 
## Removing 49 records due to NA

## 
## Removing 49 records due to NA

## 
## Removing 49 records due to NA

## 
## Removing 49 records due to NA

## 
##  *** Correlations use 8769 complete cases (99.4% of 8818 total) ***
##            TempC TempF  DewC  DewF Altimeter modSLP WindSpeed Visibility
## TempC       1.00  1.00  0.92  0.92     -0.17  -0.24     -0.10       0.14
## TempF       1.00  1.00  0.92  0.92     -0.17  -0.24     -0.10       0.14
## DewC        0.92  0.92  1.00  1.00     -0.22  -0.28     -0.18      -0.07
## DewF        0.92  0.92  1.00  1.00     -0.22  -0.28     -0.18      -0.07
## Altimeter  -0.17 -0.17 -0.22 -0.22      1.00   1.00     -0.37       0.15
## modSLP     -0.24 -0.24 -0.28 -0.28      1.00   1.00     -0.35       0.14
## WindSpeed  -0.10 -0.10 -0.18 -0.18     -0.37  -0.35      1.00       0.10
## Visibility  0.14  0.14 -0.07 -0.07      0.15   0.14      0.10       1.00
## monthint    0.24  0.24  0.32  0.32      0.19   0.17     -0.06      -0.11
## day         0.04  0.04  0.03  0.03     -0.02  -0.02      0.06       0.04
##            monthint   day
## TempC          0.24  0.04
## TempF          0.24  0.04
## DewC           0.32  0.03
## DewF           0.32  0.03
## Altimeter      0.19 -0.02
## modSLP         0.17 -0.02
## WindSpeed     -0.06  0.06
## Visibility    -0.11  0.04
## monthint       1.00  0.02
## day            0.02  1.00

## 
##  *** Regression call is: modSLP ~ Altimeter ***
## 
## Call:
## lm(formula = formula(myChar), data = met)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.96361 -0.44022 -0.03758  0.40772  1.41448 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -16.44389    0.74723  -22.01   <2e-16 ***
## Altimeter    34.41514    0.02488 1383.17   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4961 on 8767 degrees of freedom
##   (49 observations deleted due to missingness)
## Multiple R-squared:  0.9954, Adjusted R-squared:  0.9954 
## F-statistic: 1.913e+06 on 1 and 8767 DF,  p-value: < 2.2e-16

## 
##  *** Regression call is: modSLP ~ Altimeter + TempF ***
## 
## Call:
## lm(formula = formula(myChar), data = met)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.57802 -0.12674  0.00481  0.12820  0.65708 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -3.713e+00  2.803e-01  -13.25   <2e-16 ***
## Altimeter    3.403e+01  9.302e-03 3658.80   <2e-16 ***
## TempF       -2.351e-02  9.941e-05 -236.48   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1826 on 8766 degrees of freedom
##   (49 observations deleted due to missingness)
## Multiple R-squared:  0.9994, Adjusted R-squared:  0.9994 
## F-statistic: 7.086e+06 on 2 and 8766 DF,  p-value: < 2.2e-16

## 
##  *** Regression call is: TempF ~ DewF ***
## 
## Call:
## lm(formula = formula(myChar), data = met)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -11.132  -6.065  -2.083   4.042  31.029 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 11.060663   0.211976   52.18   <2e-16 ***
## DewF         1.000977   0.004677  214.02   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.987 on 8767 degrees of freedom
##   (49 observations deleted due to missingness)
## Multiple R-squared:  0.8393, Adjusted R-squared:  0.8393 
## F-statistic: 4.58e+04 on 1 and 8767 DF,  p-value: < 2.2e-16

## 
##  *** Regression call is: WindSpeed ~ Altimeter + TempF + DewF + month ***
## 
## Call:
## lm(formula = formula(myChar), data = met)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -13.3544  -2.5454  -0.1494   2.3669  20.6042 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 306.827720   6.726512  45.615  < 2e-16 ***
## Altimeter    -9.980907   0.222840 -44.790  < 2e-16 ***
## TempF         0.206384   0.006044  34.145  < 2e-16 ***
## DewF         -0.217533   0.006755 -32.204  < 2e-16 ***
## monthFeb      0.188758   0.204057   0.925   0.3550    
## monthMar     -1.375926   0.211244  -6.513 7.75e-11 ***
## monthApr     -2.406187   0.217870 -11.044  < 2e-16 ***
## monthMay     -3.616016   0.248432 -14.555  < 2e-16 ***
## monthJun     -4.337974   0.277532 -15.631  < 2e-16 ***
## monthJul     -3.428235   0.300872 -11.394  < 2e-16 ***
## monthAug     -2.971582   0.312087  -9.522  < 2e-16 ***
## monthSep     -1.839206   0.292614  -6.285 3.43e-10 ***
## monthOct     -0.477868   0.255640  -1.869   0.0616 .  
## monthNov     -0.013008   0.226830  -0.057   0.9543    
## monthDec      2.071836   0.200848  10.315  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.883 on 8754 degrees of freedom
##   (49 observations deleted due to missingness)
## Multiple R-squared:  0.3218, Adjusted R-squared:  0.3207 
## F-statistic: 296.7 on 14 and 8754 DF,  p-value: < 2.2e-16

# Retrieve the Las Vegas, NV (2016) data
klas_2016 <- combinedEDA("metar_klas_2016.rds")
## 
## Will use Las Vegas, NV (2016) as the description for metar_klas_2016.rds
## 
## Dropping 1 rows with 35 observations due to NA

## 
## Dropping 1 rows with 35 observations due to NA

## 
## Dropping 1 rows with 35 observations due to NA

## 
## Dropping 1 rows with 35 observations due to NA

## 
## Dropping 1 rows with 35 observations due to NA

## 
## Dropping 1 rows with 35 observations due to NA

## 
## Dropping 1 rows with 2440 observations due to NA

## 
## Dropping 1 rows with 35 observations due to NA

## 
## Dropping 1 rows with 35 observations due to NA

## 
## Dropping 1 rows with 35 observations due to NA

## 
## Dropping 1 rows with 35 observations due to NA

## 
## Removing 35 records due to NA

## 
## Removing 35 records due to NA

## 
## Removing 35 records due to NA

## 
## Removing 35 records due to NA

## 
## Removing 35 records due to NA

## 
## Removing 35 records due to NA

## 
## Removing 35 records due to NA

## 
## Removing 35 records due to NA

## 
##  *** Correlations use 8783 complete cases (99.6% of 8818 total) ***
##            TempC TempF  DewC  DewF Altimeter modSLP WindSpeed Visibility
## TempC       1.00  1.00  0.22  0.22     -0.51  -0.62      0.22       0.01
## TempF       1.00  1.00  0.22  0.22     -0.51  -0.62      0.22       0.01
## DewC        0.22  0.22  1.00  1.00     -0.24  -0.27     -0.04      -0.13
## DewF        0.22  0.22  1.00  1.00     -0.24  -0.27     -0.04      -0.13
## Altimeter  -0.51 -0.51 -0.24 -0.24      1.00   0.99     -0.38       0.06
## modSLP     -0.62 -0.62 -0.27 -0.27      0.99   1.00     -0.38       0.06
## WindSpeed   0.22  0.22 -0.04 -0.04     -0.38  -0.38      1.00      -0.02
## Visibility  0.01  0.01 -0.13 -0.13      0.06   0.06     -0.02       1.00
## monthint    0.14  0.14  0.06  0.06      0.06   0.03     -0.01      -0.02
## day        -0.01 -0.01  0.03  0.03     -0.01  -0.01      0.00      -0.07
##            monthint   day
## TempC          0.14 -0.01
## TempF          0.14 -0.01
## DewC           0.06  0.03
## DewF           0.06  0.03
## Altimeter      0.06 -0.01
## modSLP         0.03 -0.01
## WindSpeed     -0.01  0.00
## Visibility    -0.02 -0.07
## monthint       1.00  0.02
## day            0.02  1.00

## 
##  *** Regression call is: modSLP ~ Altimeter ***
## 
## Call:
## lm(formula = formula(myChar), data = met)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.1381 -0.7848 -0.0607  0.6891  3.6362 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -116.30984    1.72975  -67.24   <2e-16 ***
## Altimeter     37.68826    0.05776  652.51   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9753 on 8781 degrees of freedom
##   (35 observations deleted due to missingness)
## Multiple R-squared:  0.9798, Adjusted R-squared:  0.9798 
## F-statistic: 4.258e+05 on 1 and 8781 DF,  p-value: < 2.2e-16

## 
##  *** Regression call is: modSLP ~ Altimeter + TempF ***
## 
## Call:
## lm(formula = formula(myChar), data = met)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.18161 -0.33113  0.02395  0.33395  1.08509 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -2.537e+01  8.693e-01  -29.18   <2e-16 ***
## Altimeter    3.478e+01  2.868e-02 1212.89   <2e-16 ***
## TempF       -5.581e-02  2.813e-04 -198.42   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4165 on 8780 degrees of freedom
##   (35 observations deleted due to missingness)
## Multiple R-squared:  0.9963, Adjusted R-squared:  0.9963 
## F-statistic: 1.187e+06 on 2 and 8780 DF,  p-value: < 2.2e-16

## 
##  *** Regression call is: TempF ~ DewF ***
## 
## Call:
## lm(formula = formula(myChar), data = met)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -38.026 -14.381  -0.341  13.299  46.312 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 62.12523    0.49251  126.14   <2e-16 ***
## DewF         0.32810    0.01576   20.82   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 17.94 on 8781 degrees of freedom
##   (35 observations deleted due to missingness)
## Multiple R-squared:  0.04705,    Adjusted R-squared:  0.04694 
## F-statistic: 433.5 on 1 and 8781 DF,  p-value: < 2.2e-16

## 
##  *** Regression call is: WindSpeed ~ Altimeter + TempF + DewF + month ***
## 
## Call:
## lm(formula = formula(myChar), data = met)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -10.1199  -2.7665  -0.6134   2.0975  22.2630 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 316.166903   9.583584  32.990  < 2e-16 ***
## Altimeter   -10.363527   0.316290 -32.766  < 2e-16 ***
## TempF         0.012196   0.005239   2.328 0.019933 *  
## DewF         -0.053040   0.004107 -12.915  < 2e-16 ***
## monthFeb      1.477528   0.225729   6.546 6.26e-11 ***
## monthMar      0.848312   0.232319   3.652 0.000262 ***
## monthApr      1.590035   0.242532   6.556 5.84e-11 ***
## monthMay      0.327608   0.261841   1.251 0.210905    
## monthJun      0.141204   0.319290   0.442 0.658325    
## monthJul      1.092983   0.328362   3.329 0.000876 ***
## monthAug      0.545515   0.319165   1.709 0.087450 .  
## monthSep      0.647423   0.281377   2.301 0.021420 *  
## monthOct      1.147640   0.253286   4.531 5.95e-06 ***
## monthNov      1.097005   0.227198   4.828 1.40e-06 ***
## monthDec      0.672432   0.214810   3.130 0.001752 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.181 on 8768 degrees of freedom
##   (35 observations deleted due to missingness)
## Multiple R-squared:  0.1774, Adjusted R-squared:  0.176 
## F-statistic:   135 on 14 and 8768 DF,  p-value: < 2.2e-16

# Retrieve the New Orleans, LA (2016) data
kmsy_2016 <- combinedEDA("metar_kmsy_2016.rds")
## 
## Will use New Orleans, LA (2016) as the description for metar_kmsy_2016.rds
## 
## Dropping 1 rows with 14 observations due to NA

## 
## Dropping 1 rows with 14 observations due to NA

## 
## Dropping 1 rows with 14 observations due to NA

## 
## Dropping 1 rows with 14 observations due to NA

## 
## Dropping 1 rows with 14 observations due to NA

## 
## Dropping 1 rows with 14 observations due to NA

## 
## Dropping 1 rows with 1029 observations due to NA

## 
## Dropping 1 rows with 14 observations due to NA

## 
## Dropping 1 rows with 14 observations due to NA

## 
## Dropping 1 rows with 14 observations due to NA

## 
## Dropping 1 rows with 14 observations due to NA

## 
## Removing 14 records due to NA

## 
## Removing 14 records due to NA

## 
## Removing 14 records due to NA

## 
## Removing 14 records due to NA

## 
## Removing 14 records due to NA

## 
## Removing 14 records due to NA

## 
## Removing 14 records due to NA

## 
## Removing 14 records due to NA

## 
##  *** Correlations use 8799 complete cases (99.8% of 8813 total) ***
##            TempC TempF  DewC  DewF Altimeter modSLP WindSpeed Visibility
## TempC       1.00  1.00  0.85  0.85     -0.48  -0.47     -0.09       0.13
## TempF       1.00  1.00  0.85  0.85     -0.48  -0.48     -0.09       0.13
## DewC        0.85  0.85  1.00  1.00     -0.56  -0.56     -0.21      -0.06
## DewF        0.85  0.85  1.00  1.00     -0.57  -0.56     -0.21      -0.05
## Altimeter  -0.48 -0.48 -0.56 -0.57      1.00   1.00     -0.08       0.08
## modSLP     -0.47 -0.48 -0.56 -0.56      1.00   1.00     -0.08       0.08
## WindSpeed  -0.09 -0.09 -0.21 -0.21     -0.08  -0.08      1.00       0.03
## Visibility  0.13  0.13 -0.06 -0.05      0.08   0.08      0.03       1.00
## monthint    0.26  0.26  0.31  0.31      0.09   0.09     -0.22      -0.02
## day         0.03  0.03  0.07  0.07      0.02   0.02     -0.05       0.01
##            monthint   day
## TempC          0.26  0.03
## TempF          0.26  0.03
## DewC           0.31  0.07
## DewF           0.31  0.07
## Altimeter      0.09  0.02
## modSLP         0.09  0.02
## WindSpeed     -0.22 -0.05
## Visibility    -0.02  0.01
## monthint       1.00  0.02
## day            0.02  1.00

## 
##  *** Regression call is: modSLP ~ Altimeter ***
## 
## Call:
## lm(formula = formula(myChar), data = met)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.232515 -0.083700  0.000544  0.084787  0.227320 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  2.773456   0.230067   12.05   <2e-16 ***
## Altimeter   33.779607   0.007654 4413.27   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1015 on 8797 degrees of freedom
##   (14 observations deleted due to missingness)
## Multiple R-squared:  0.9995, Adjusted R-squared:  0.9995 
## F-statistic: 1.948e+07 on 1 and 8797 DF,  p-value: < 2.2e-16

## 
##  *** Regression call is: modSLP ~ Altimeter + TempF ***
## 
## Call:
## lm(formula = formula(myChar), data = met)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.222066 -0.082359 -0.001036  0.083533  0.217460 
## 
## Coefficients:
##              Estimate Std. Error  t value Pr(>|t|)    
## (Intercept) 1.557e+00  2.638e-01    5.903  3.7e-09 ***
## Altimeter   3.382e+01  8.667e-03 3902.029  < 2e-16 ***
## TempF       8.751e-04  9.432e-05    9.277  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.101 on 8796 degrees of freedom
##   (14 observations deleted due to missingness)
## Multiple R-squared:  0.9996, Adjusted R-squared:  0.9996 
## F-statistic: 9.833e+06 on 2 and 8796 DF,  p-value: < 2.2e-16

## 
##  *** Regression call is: TempF ~ DewF ***
## 
## Call:
## lm(formula = formula(myChar), data = met)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -16.051  -4.777  -1.265   4.199  26.477 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 24.440028   0.319858   76.41   <2e-16 ***
## DewF         0.778877   0.005065  153.77   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.763 on 8797 degrees of freedom
##   (14 observations deleted due to missingness)
## Multiple R-squared:  0.7288, Adjusted R-squared:  0.7288 
## F-statistic: 2.364e+04 on 1 and 8797 DF,  p-value: < 2.2e-16

## 
##  *** Regression call is: WindSpeed ~ Altimeter + TempF + DewF + month ***
## 
## Call:
## lm(formula = formula(myChar), data = met)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -12.9452  -2.5710  -0.2247   2.2592  18.9024 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 190.986029  11.812622  16.168  < 2e-16 ***
## Altimeter    -6.223347   0.387357 -16.066  < 2e-16 ***
## TempF         0.225378   0.007290  30.915  < 2e-16 ***
## DewF         -0.169883   0.006513 -26.082  < 2e-16 ***
## monthFeb     -0.787460   0.204817  -3.845 0.000122 ***
## monthMar     -0.270554   0.215758  -1.254 0.209885    
## monthApr     -1.941174   0.225642  -8.603  < 2e-16 ***
## monthMay     -3.763407   0.241926 -15.556  < 2e-16 ***
## monthJun     -5.215285   0.269852 -19.326  < 2e-16 ***
## monthJul     -5.568010   0.283515 -19.639  < 2e-16 ***
## monthAug     -4.834769   0.277122 -17.446  < 2e-16 ***
## monthSep     -5.769099   0.272789 -21.149  < 2e-16 ***
## monthOct     -4.812814   0.245841 -19.577  < 2e-16 ***
## monthNov     -2.827680   0.220138 -12.845  < 2e-16 ***
## monthDec     -0.203742   0.210552  -0.968 0.333241    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.847 on 8784 degrees of freedom
##   (14 observations deleted due to missingness)
## Multiple R-squared:  0.2278, Adjusted R-squared:  0.2266 
## F-statistic: 185.1 on 14 and 8784 DF,  p-value: < 2.2e-16